Smart Paper Notes

Deep Reinforcement Learning for Optimal Power Flow with Renewables Using Spatial-Temporal Graph Information

Jinhao Li , Ruichang Zhang , Hao Wang , Zhi Liu , Hongyang Lai , Yanru Zhang

Categories: Machine Learning | Artificial Intelligence

2021-12-22

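Renewable energy resources (RERs) have been increasingly integrated into modern power systems, especially in large-scale distribution networks (DNs). In this paper, we propose a deep reinforcement learning (DRL)-based approach to dynamically search for the optimal operating point, i.e., optimal power flow (OPF), in DNs with a high uptake of RERs. Considering the uncertainty and voltage fluctuation issues caused by RERs, we formulate OPF as a multi-objective optimization (MOO) problem. To solve the MOO problem, we develop a novel DRL algorithm that leverages the graph information of the distribution network. Specifically, we employ a state-of-the-art DRL algorithm, deep deterministic policy gradient (DDPG), to learn the optimal strategy for OPF. Because power flow reallocation in a DN is a continuous process in which nodes are self-correlated and interrelated in both the temporal and spatial views, we develop a multi-grained attention-based spatial-temporal graph convolution network (MG-ASTGCN) to make full use of the DN's graph information. MG-ASTGCN extracts spatial-temporal graph information in preparation for the subsequent DDPG. We validate the proposed DRL-based approach on modified IEEE 33-, 69-, and 118-bus radial distribution systems (RDSs) and show that it outperforms other benchmark algorithms. Our experimental results also reveal that MG-ASTGCN can significantly accelerate the DDPG training process and improve DDPG's capability to reallocate power flow for OPF. The proposed DRL-based approach also promotes the stability of DNs in the presence of node faults, especially for large-scale DNs.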

Translated by Google Translate

CORE-Text: Improving Scene Text Detection with Contrastive Relational Reasoning

Jingyang Lin , Yingwei Pan , Rongfeng Lai , Xuehang Yang , Hongyang Chao , Ting Yao

Categories: Computer Vision | Artificial Intelligence

2021-12-14

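Localizing text instances in natural scenes is regarded as a fundamental challenge in computer vision. Nevertheless, owing to the extreme aspect ratios and scales of text instances in real scenes, most conventional text detectors suffer from the sub-text problem, in which only fragments of a text instance (i.e., sub-texts) are localized. In this work, we quantitatively analyze the sub-text problem and present a simple yet effective design, the COntrastive RElation (CORE) module, to mitigate it. CORE first leverages a vanilla relation block to model the relations among all text proposals (sub-texts of multiple text instances) and then further enhances relational reasoning through instance-level sub-text discrimination in a contrastive manner. This naturally learns instance-aware representations of text proposals and thus facilitates scene text detection. We integrate the CORE module into a two-stage text detector based on Mask R-CNN and devise our text detector, CORE-Text. Extensive experiments on four benchmarks demonstrate the superiority of CORE-Text. Code is available at https://github.com/jylins/core-text.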
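
The contrastive sub-text discrimination described above can be illustrated with a generic InfoNCE-style loss. This is only a minimal sketch: the abstract does not give the exact loss used by CORE, so the function name `info_nce`, the temperature value, and the toy two-dimensional embeddings are all assumptions made for illustration. The idea shown is that a proposal embedding is pulled toward a sub-text of the same text instance (the positive) and pushed away from sub-texts of other instances (the negatives).

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _normalize(v):
    norm = math.sqrt(_dot(v, v))
    return [x / norm for x in v]


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss on cosine similarities (hypothetical helper)."""
    a = _normalize(anchor)
    # Index 0 holds the positive pair; the rest are negative pairs.
    logits = [_dot(a, _normalize(positive)) / temperature]
    logits += [_dot(a, _normalize(n)) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[0]


# Toy embeddings: the anchor and the first positive point in a similar direction.
anchor = [1.0, 0.0]
loss_same_instance = info_nce(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
loss_wrong_instance = info_nce(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
print(loss_same_instance < loss_wrong_instance)
```

The loss is small when the positive is the most similar candidate and grows when a negative is more similar than the positive, which is the behavior a discrimination objective of this kind relies on.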

Policy Pre-training for End-to-end Autonomous Driving via Self-supervised Geometric Modeling

Penghao Wu , Li Chen , Hongyang Li , Xiaosong Jia , Junchi Yan , Yu Qiao

Categories: Computer Vision

2023-01-03

Witnessing the impressive achievements of pre-training techniques on large-scale data in computer vision and natural language processing, we ask whether this idea can be adapted in a grab-and-go spirit to mitigate the sample inefficiency problem in visuomotor driving. Given the highly dynamic and variant nature of the input, the visuomotor driving task inherently lacks view and translation invariance, and the visual input contains a large amount of information that is irrelevant to decision making, which makes the predominant pre-training approaches from general vision less suitable for autonomous driving. To this end, we propose PPGeo (Policy Pre-training via Geometric modeling), an intuitive and straightforward fully self-supervised framework curated for policy pre-training in visuomotor driving. We aim to learn policy representations as a powerful abstraction by modeling 3D geometric scenes on large-scale unlabeled and uncalibrated YouTube driving videos. PPGeo is performed in two stages to support effective self-supervised training. In the first stage, the geometric modeling framework generates pose and depth predictions simultaneously, with two consecutive frames as input. In the second stage, the visual encoder learns a driving policy representation by predicting the future ego-motion and optimizing the photometric error based on the current visual observation only. As such, the pre-trained visual encoder is equipped with rich driving-policy-related representations and is therefore competent for multiple visuomotor driving tasks. Extensive experiments covering a wide span of challenging scenarios demonstrate the superiority of the proposed approach, with improvements ranging from 2% to over 100% with very limited data. Code and models will be available at https://github.com/OpenDriveLab/PPGeo.

Online Statistical Inference for Contextual Bandits via Stochastic Gradient Descent

Xi Chen , Zehua Lai , He Li , Yichen Zhang

Categories: Machine Learning (Statistics) | Machine Learning

2022-12-30

With the fast development of big data, it has become easier than before to learn the optimal decision rule by updating the decision rule recursively and making online decisions. We study the online statistical inference of model parameters in a contextual bandit framework of sequential decision-making. We propose a general framework for an online and adaptive data collection environment that can update decision rules via weighted stochastic gradient descent. We allow different weighting schemes of the stochastic gradient and establish the asymptotic normality of the parameter estimator. Our proposed estimator significantly improves the asymptotic efficiency over the previous averaged SGD approach via inverse probability weights. We also conduct an optimality analysis on the weights in a linear regression setting. We provide a Bahadur representation of the proposed estimator and show that the remainder term in the Bahadur representation entails a slower convergence rate compared to classical SGD due to the adaptive data collection.
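
To make the idea of weighted SGD under adaptive data collection concrete, here is a minimal sketch of an epsilon-greedy, two-armed linear bandit whose per-arm parameters are updated by a weighted stochastic gradient step. Everything specific in it is an assumption for illustration: the function name `weighted_sgd_bandit`, the scalar context, the true slopes, the step-size schedule, and the default inverse-probability weight. The abstract allows several weighting schemes and does not state which one the authors recommend, so the weight is left as a replaceable `weight_fn`.

```python
import random


def weighted_sgd_bandit(num_steps=20000, epsilon=0.5, noise_std=0.1,
                        weight_fn=lambda prob: 1.0 / prob, seed=0):
    """Epsilon-greedy two-armed linear bandit with weighted SGD updates."""
    rng = random.Random(seed)
    true_theta = (1.0, 2.0)   # hypothetical ground-truth slope of each arm
    theta = [0.0, 0.0]        # running SGD estimates, one per arm
    for t in range(1, num_steps + 1):
        x = rng.uniform(-1.0, 1.0)  # scalar context
        greedy = 0 if theta[0] * x >= theta[1] * x else 1
        # Explore uniformly with probability epsilon, otherwise exploit.
        arm = rng.choice((0, 1)) if rng.random() < epsilon else greedy
        # Probability with which the chosen arm was selected.
        prob = epsilon / 2 + ((1 - epsilon) if arm == greedy else 0.0)
        reward = true_theta[arm] * x + rng.gauss(0.0, noise_std)
        # Gradient of 0.5 * (theta * x - reward)^2 for the chosen arm.
        grad = (theta[arm] * x - reward) * x
        step = 0.3 / t ** 0.6  # decaying step size
        theta[arm] -= step * weight_fn(prob) * grad
    return theta


estimates = weighted_sgd_bandit()
print(estimates)
```

Passing a different `weight_fn`, for example `lambda prob: 1.0`, gives the unweighted update, so the two schemes can be compared on the same simulated stream.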

Goal-oriented Autonomous Driving

Yihan Hu , Jiazhi Yang , Li Chen , Keyu Li , Chonghao Sima , Xizhou Zhu , Siqi Chai , Senyao Du , Tianwei Lin , Wenhai Wang

Categories: Computer Vision | Robotics

2022-12-20

Modern autonomous driving systems are organized as modular tasks in sequential order, i.e., perception, prediction, and planning. As sensors and hardware improve, there is a growing trend toward devising a system that can perform a wide diversity of tasks to achieve higher-level intelligence. Contemporary approaches resort to either deploying standalone models for individual tasks or designing a multi-task paradigm with separate heads. These might suffer from accumulated errors or negative transfer effects. Instead, we argue that a favorable algorithmic framework should be devised and optimized in pursuit of the ultimate goal, i.e., planning of the self-driving car. Oriented toward this goal, we revisit the key components within perception and prediction. We analyze each module and prioritize the tasks hierarchically, such that all these tasks contribute to planning (the goal). To this end, we introduce Unified Autonomous Driving (UniAD), the first comprehensive framework to date that incorporates full-stack driving tasks in one network. It is carefully devised to leverage the advantages of each module and to provide complementary feature abstractions for agent interaction from a global perspective. Tasks communicate through a unified query design so that they facilitate each other toward planning. We instantiate UniAD on the challenging nuScenes benchmark. With extensive ablations, this philosophy is shown to surpass previous state-of-the-art methods by a large margin in all aspects. The full codebase and models will be made available to facilitate future research in the community.

Fast Converging Anytime Model Counting

Yong Lai , Kuldeep S. Meel , Roland H. C. Yap

Categories: Artificial Intelligence

2022-12-19

Model counting is a fundamental problem that has been influential in many applications, from artificial intelligence to formal verification. Due to the intrinsic hardness of model counting, approximate techniques have been developed to solve real-world instances of model counting. This paper designs a new anytime approach called PartialKC for approximate model counting. The idea is a form of partial knowledge compilation that provides an unbiased estimate of the model count, which can converge to the exact count. Our empirical analysis demonstrates that PartialKC achieves significantly better scalability and accuracy than prior state-of-the-art approximate counters, including satss and STS. Interestingly, the empirical results show that PartialKC reaches convergence for many instances and therefore provides exact model counting performance comparable to state-of-the-art exact counters.
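
For readers unfamiliar with the problem, the sketch below shows what a model count is and what an unbiased estimate of it looks like on a tiny formula. It is not PartialKC: the exact counter simply enumerates all assignments, and the estimator uses plain uniform sampling rather than partial knowledge compilation. The function names and the example formula are assumptions chosen for illustration. Clauses follow the DIMACS convention, where a positive integer is a variable and a negative integer is its negation.

```python
import itertools
import random


def satisfies(clauses, assignment):
    """True if the assignment (tuple of bools, variable i at index i-1) satisfies every clause."""
    return all(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def count_models(clauses, num_vars):
    """Exact model count by enumerating all 2^n assignments (small n only)."""
    return sum(
        satisfies(clauses, a)
        for a in itertools.product((False, True), repeat=num_vars)
    )


def estimate_models(clauses, num_vars, samples, seed=0):
    """Unbiased Monte Carlo estimate: 2^n times the satisfied fraction."""
    rng = random.Random(seed)
    hits = sum(
        satisfies(clauses, tuple(rng.random() < 0.5 for _ in range(num_vars)))
        for _ in range(samples)
    )
    return (2 ** num_vars) * hits / samples


# (x1 OR x2) AND (NOT x1 OR x3)
formula = [[1, 2], [-1, 3]]
exact = count_models(formula, 3)
approx = estimate_models(formula, 3, samples=20000)
print(exact, approx)
```

Uniform sampling becomes useless when satisfying assignments are rare, which is one reason practical approximate counters rely on more structured techniques.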

Multi-embodiment Legged Robot Control as a Sequence Modeling Problem

Chen Yu , Weinan Zhang , Hang Lai , Zheng Tian , Laurent Kneip , Jun Wang

Categories: Robotics

2022-12-18

Robots are traditionally bound to a fixed embodiment during their operational lifetime, which limits their ability to adapt to their surroundings. Co-optimizing the control and morphology of a robot, however, is often inefficient due to the complex interplay between the controller and the morphology. In this paper, we propose a learning-based control method that inherently takes morphology into consideration, such that once the control policy is trained in the simulator, it can be easily deployed to robots with different embodiments in the real world. In particular, we present the Embodiment-aware Transformer (EAT), an architecture that casts this control problem as conditional sequence modeling. EAT outputs the optimal actions by leveraging a causally masked Transformer. By conditioning an autoregressive model on the desired robot embodiment, past states, and actions, our EAT model can generate future actions that best fit the current robot embodiment. Experimental results show that EAT can outperform all other alternatives in embodiment-varying tasks and succeed in an example of a real-world evolution task: stepping down a stair by updating the morphology alone. We hope that EAT will inspire a new push toward real-world evolution across many domains, where algorithms like EAT can blaze a trail by bridging the fields of evolutionary robotics and big data sequence modeling.
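
The causal masking mentioned above can be sketched in a few lines. The function below turns a square matrix of raw attention scores into attention weights in which position i can only attend to positions 0 through i, so a prediction at one step never sees later steps. This is a generic illustration, not EAT's implementation: the function name `causal_attention_weights` is hypothetical, and the way EAT orders embodiment, state, and action tokens is not specified in the abstract.

```python
import math


def causal_attention_weights(scores):
    """Row-wise softmax over a square score matrix with future positions masked out."""
    n = len(scores)
    weights = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]  # positions 0..i are allowed
        m = max(visible)        # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in visible]
        total = sum(exps)
        # Masked (future) positions receive exactly zero weight.
        weights.append([e / total for e in exps] + [0.0] * (n - i - 1))
    return weights


# With equal scores, each row spreads its weight evenly over the visible prefix.
w = causal_attention_weights([[0.0, 0.0, 0.0],
                              [0.0, 0.0, 0.0],
                              [0.0, 0.0, 0.0]])
for row in w:
    print(row)
```

In a conditional sequence model, conditioning information such as an embodiment descriptor would typically sit at the start of the sequence so that every later position can attend to it under this mask.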

Werewolf Among Us: A Multimodal Dataset for Modeling Persuasion Behaviors in Social Deduction Games

Bolin Lai , Hongxin Zhang , Miao Liu , Aryan Pariani , Fiona Ryan , Wenqi Jia , Shirley Anugrah Hayati , James M. Rehg , Diyi Yang

Categories: Machine Learning | Natural Language Processing | Computer Vision

2022-12-16

Persuasion modeling is a key building block for conversational agents. Existing works in this direction are limited to analyzing textual dialogue corpora. We argue that visual signals also play an important role in understanding human persuasive behaviors. In this paper, we introduce the first multimodal dataset for modeling persuasion behaviors. Our dataset includes 199 dialogue transcriptions and videos captured in a multi-player social deduction game setting, 26,647 utterance-level annotations of persuasion strategy, and game-level annotations of deduction game outcomes. We provide extensive experiments to show how dialogue context and visual signals benefit persuasion strategy prediction. We also explore the generalization ability of language models for persuasion modeling and the role of persuasion strategies in predicting social deduction game outcomes. Our dataset, code, and models can be found at https://persuasion-deductiongame.socialai-data.org.

DeepDFA: Dataflow Analysis-Guided Efficient Graph Learning for Vulnerability Detection

Benjamin Steenhoek , Wei Le , Hongyang Gao

Categories: Machine Learning

2022-12-15

Deep learning-based vulnerability detection models have recently been shown to be effective and, in some cases, outperform static analysis tools. However, the highest-performing approaches use token-based transformer models, which do not leverage domain knowledge. Classical program analysis techniques such as dataflow analysis can detect many types of bugs and are the most commonly used methods in practice. Motivated by the causal relationship between bugs and dataflow analysis, we present DeepDFA, a dataflow analysis-guided graph learning framework and embedding that uses program semantic features for vulnerability detection. We show that DeepDFA is performant and efficient. DeepDFA ranked first in recall, first in generalizing over unseen projects, and second in F1 among all the state-of-the-art models we experimented with. It is also the smallest model in terms of the number of parameters, and was trained in 9 minutes, 69x faster than the highest-performing baseline. DeepDFA can be used with other models. By integrating LineVul and DeepDFA, we achieved the best vulnerability detection performance of 96.4 F1 score, 98.69 precision, and 94.22 recall.
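
As background for the dataflow analysis the abstract refers to, the sketch below runs a classic reaching-definitions analysis on a tiny control-flow graph using a worklist. It illustrates the classical technique only. How DeepDFA encodes such information into a graph learning model is not described in the abstract, and the function name, node numbering, and definition labels here are assumptions made for the example.

```python
def reaching_definitions(cfg, gen, kill):
    """Worklist reaching-definitions analysis.

    cfg maps each node to its successor list; gen and kill map each node
    to the set of definitions it creates and overwrites, respectively.
    """
    preds = {node: [] for node in cfg}
    for node, succs in cfg.items():
        for succ in succs:
            preds[succ].append(node)
    in_sets = {node: set() for node in cfg}
    out_sets = {node: set(gen[node]) for node in cfg}
    worklist = list(cfg)
    while worklist:
        node = worklist.pop()
        # IN[n] is the union of OUT[p] over all predecessors p.
        in_sets[node] = set().union(*(out_sets[p] for p in preds[node]))
        new_out = gen[node] | (in_sets[node] - kill[node])
        if new_out != out_sets[node]:
            out_sets[node] = new_out
            worklist.extend(cfg[node])  # successors must be revisited
    return in_sets, out_sets


# 1: x = 0 (d1)   2: loop test   3: x = x + 1 (d3)   4: return x
cfg = {1: [2], 2: [3, 4], 3: [2], 4: []}
gen = {1: {"d1"}, 2: set(), 3: {"d3"}, 4: set()}
kill = {1: {"d3"}, 2: set(), 3: {"d1"}, 4: set()}
in_sets, out_sets = reaching_definitions(cfg, gen, kill)
print(sorted(in_sets[4]))
```

Both definitions of `x` reach the return statement because the loop body may run zero or more times, which is the kind of program fact a token-based model has no direct access to.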

Sim-to-Real Transfer for Quadrupedal Locomotion via Terrain Transformer

Hang Lai , Weinan Zhang , Xialin He , Chen Yu , Zheng Tian , Yong Yu , Jun Wang

Categories: Robotics | Machine Learning

2022-12-15

Deep reinforcement learning has recently emerged as an appealing alternative for legged locomotion over multiple terrains by training a policy in physical simulation and then transferring it to the real world (i.e., sim-to-real transfer). Despite considerable progress, the capacity and scalability of traditional neural networks are still limited, which may hinder their application in more complex environments. In contrast, the Transformer architecture has shown its superiority in a wide range of large-scale sequence modeling tasks, including natural language processing and decision-making problems. In this paper, we propose Terrain Transformer (TERT), a high-capacity Transformer model for quadrupedal locomotion control on various terrains. Furthermore, to better leverage the Transformer in sim-to-real scenarios, we present a novel two-stage training framework consisting of an offline pretraining stage and an online correction stage, which can naturally integrate the Transformer with privileged training. Extensive experiments in simulation demonstrate that TERT outperforms state-of-the-art baselines on different terrains in terms of return, energy consumption, and control smoothness. In further real-world validation, TERT successfully traverses nine challenging terrains, including a sand pit and descending stairs, which strong baselines cannot accomplish.
